Abstract

Musso et al. (2013) predict students’ academic achievement with high accuracy one year in 
advance from cognitive and demographic variables, using artificial neural networks (ANNs). 
They conclude that ANNs have high potential for theoretical and practical improvements in 
learning sciences. ANNs are powerful statistical modelling tools but they can mainly be used for 
exploratory modelling. Moreover, the output generated from ANNs cannot be fully translated 
into a meaningful set of rules because they store information about input-output relations in a 
complex, distributed, and implicit way. These problems hamper systematic theory-building as 
well as communication and justification of model predictions in practical contexts. Modern-day 
regression techniques, including (Bayesian) structural equation models, have advantages 
similar to those of ANNs but without the drawbacks. They are able to handle numerous 
variables, non-linear effects, multi-way interactions, and incomplete data. Thus, researchers in 
the learning sciences should prefer more theory-driven and parsimonious modelling techniques 
over ANNs whenever possible. 
Musso, Kyndt, Cascallar, and Dochy (2013) conducted a study in which the statistical modelling technique 
of artificial neural networks (ANNs) was used to predict the academic achievement of university students a 
year in advance. The measures used were attention, working memory, learning strategies, and demographic 
variables. The results were precise estimations of each student’s achievement tercile after their first year at 
university. This is an impressive success, demonstrating the usefulness of ANNs as a statistical modelling 
tool. 
The study raises an important question about which statistical methods researchers in the 
learning sciences should prefer. Should ANNs replace conventional statistical methods such as multiple regression, 
discriminant analysis, and structural equation modelling? The potential of ANNs cannot be denied, 
especially as a tool to examine predictive patterns in complex systems. However, Musso and colleagues 
overestimate the ability of ANNs in their application to the learning sciences. They do not mention 
shortcomings of ANNs, while overemphasizing shortcomings of competing conventional methods. 

ANNs are limited in at least two important ways. First, the construction of ANN models such as those 
used by Musso et al. is highly exploratory apart from choosing relevant input and output variables (Gunther, 
Pigeot, & Bammann, 2012; Scarborough & Somers, 2006). The connection weights, which determine how an 
ANN transforms input into output patterns, are not specified by the researchers or based on theory. They are 
set to random values and changed gradually by an optimisation algorithm. This process usually involves 
thousands of iterations until each input pattern leads to the desired output pattern in the training data set. 
ANNs are therefore not directly comparable to conventional methods, which aim at confirming or 
disconfirming pre-specified relations and interactions. In other words, the research question should 
determine whether the exploratory nature of ANNs is adequate, or if a conventional, confirmatory model 
should be the method of choice. 
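
The training process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the network used by Musso et al.: the XOR data, the layer size, the learning rate, and the names `train` and `forward` are all hypothetical choices made for this example. It shows that the weights start as random numbers and are then adjusted over many iterations, with no theory entering at any point.

```python
import math
import random

# XOR: a small non-linear input-output mapping, used only for illustration.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def forward(w_in, w_out, x):
    """Return hidden activations and the network output for one input pattern."""
    h = [math.tanh(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_in]
    z = sum(w * hj for w, hj in zip(w_out, h)) + w_out[-1]
    return h, 1.0 / (1.0 + math.exp(-z))

def mean_squared_error(w_in, w_out):
    return sum((forward(w_in, w_out, x)[1] - t) ** 2 for x, t in DATA) / len(DATA)

def train(seed, hidden=3, epochs=2000, lr=0.5):
    """Full-batch gradient descent on a 2-input, one-hidden-layer network."""
    rng = random.Random(seed)
    # Weights start as random numbers; no theory enters at this point.
    w_in = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    initial_error = mean_squared_error(w_in, w_out)
    for _ in range(epochs):
        g_in = [[0.0] * 3 for _ in range(hidden)]
        g_out = [0.0] * (hidden + 1)
        for x, t in DATA:
            h, y = forward(w_in, w_out, x)
            d_out = 2.0 * (y - t) * y * (1.0 - y) / len(DATA)
            for j in range(hidden):
                g_out[j] += d_out * h[j]
                d_hid = d_out * w_out[j] * (1.0 - h[j] ** 2)
                g_in[j][0] += d_hid * x[0]
                g_in[j][1] += d_hid * x[1]
                g_in[j][2] += d_hid  # bias of hidden unit j
            g_out[-1] += d_out  # bias of the output unit
        # Each iteration nudges every weight a little against its error gradient.
        for j in range(hidden):
            for i in range(3):
                w_in[j][i] -= lr * g_in[j][i]
        for j in range(hidden + 1):
            w_out[j] -= lr * g_out[j]
    return w_in, w_out, initial_error, mean_squared_error(w_in, w_out)

if __name__ == "__main__":
    for seed in (1, 2):
        w_in, w_out, before, after = train(seed)
        print(f"seed {seed}: error {before:.3f} -> {after:.3f}")
        print("  output weights:", [round(w, 2) for w in w_out])
```

Running the sketch with two different seeds typically yields two different sets of weights for the same task, which illustrates why individual weights do not lend themselves to a theoretical reading.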

Second, connection weights cannot be codified into a coherent set of rules that delineate the process 
by which ANNs transform input patterns into output patterns. ANNs typically have a high number of 
connections between neurons (e.g., 300 in ANN1 by Musso et al.). The transformation process of input into 
output patterns is determined by non-linear, multi-way interactions of these connection weights. Recent 
research has attempted to increase the interpretability of ANNs, for example with the help of visualizations 
for complex interactions (e.g., Cortez & Embrechts, 2013; Intrator & Intrator, 2001). However, the basic 
problem of how non-linear interactions between hundreds of variables can be understood and communicated 
in meaningful terms has not yet been solved, causing ANNs to be frequently characterised as “black boxes” 
(cf. Benitez, Castro, & Requena, 1997). While one can assess how well an ANN works, it is difficult to 
comprehensively explain why it performs well or not (Scarborough & Somers, 2006). To interpret their 
results, Musso and colleagues list an importance parameter for each predictor but these parameters do not 
explain interaction effects or non-linear relations among the variables. In addition, it is difficult to integrate 
the results of ANNs across studies and to generalise from samples to underlying populations because ANNs do not 
provide output parameters such as standard errors and error probabilities. 

The exploratory and opaque nature of ANNs impedes theory development and limits their practical 
application. Each relation in a statistical model should ideally correspond to a matching relation in an 
educational or psychological theory that justifies and explains the assumed statistical relation. Researchers 
can compare competing theories and revise assumptions that are not in line with the empirical data by 
fitting a series of statistical models that differ in theoretically relevant aspects (Kaplan, 1990). This is not 
possible with ANNs because the input-output relations are implicitly coded and distributed over all 
connection weights, preventing researchers from being able to map elements of an ANN and elements of a 
theory onto each other (Luger, 2009, p. 680). 

The results obtained from ANN models are also of limited use for solving real-life problems. This 
limitation can be illustrated in a situation where diagnosticians would have to tell certain high school 
students that despite achieving satisfactory levels in their current academic performances, they cannot be 
admitted to college because an ANN predicts low academic performance in the future. In justifying the 
results, the diagnosticians would have to admit that they cannot explain how the different predictors 
statistically combine, nor describe the causal processes that will contribute to the anticipated decrease in the 
students’ achievement. These limitations are unsatisfactory from diagnostic, educational, and public 
policymaking perspectives. 

Conventional methods represent more parsimonious and theory-driven alternatives to ANNs because 
they use smaller numbers of parameters, which enhances the interpretability of results. Like ANNs, modern 
regression techniques can account for non-linear relations (Bates & Watts, 2007) and complex interactions 
between variables (Aiken & West, 1991). Structural equation models are built on regression techniques and 
allow a simultaneous analysis of numerous variables. These models can be estimated by methods that are 
robust to missing data and non-normal distributions, account for hierarchical data structures, and identify 
heterogeneous sub-populations in mixture models (Hoyle, 2012). Bayesian structural equation 
models in particular represent a strong advancement in modelling non-linear relations, assessing unspecified relations, and 
handling highly non-normal and hierarchical data (Song & Lee, 2012). In contrast to ANNs, these modelling 
techniques require explicit theoretical assumptions about the relationship of the variables and they allow for 
explicit tests of these assumptions. This might limit their predictive power compared to ANNs, but it aids 
theory-building, hypothesis testing, and the communication of model results in practical applications. 
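
As a rough illustration of this point, the following Python sketch fits an ordinary least squares regression with one quadratic term and one two-way interaction to simulated data. Everything in it is an assumption made for the example: the data are randomly generated, the predictor labels (working memory, attention) are borrowed from the measures named above purely as labels, and the coefficient values are arbitrary. It is not a reanalysis of any real data set. The point is only that such a model returns a handful of named coefficients, each with a standard error.

```python
import math
import random

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_ols(rows, y):
    """Return OLS coefficients and their standard errors."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    residuals = [t - sum(b * v for b, v in zip(beta, r)) for r, t in zip(rows, y)]
    sigma2 = sum(e * e for e in residuals) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se

# Hypothetical model terms and arbitrary "true" coefficients for the simulation.
TERMS = ["intercept", "working_memory", "attention",
         "working_memory^2", "working_memory*attention"]
TRUE_COEFS = [1.0, 2.0, -1.0, 0.5, 1.5]

def design_row(wm, att):
    """Linear terms plus one non-linear term and one interaction."""
    return [1.0, wm, att, wm ** 2, wm * att]

def simulate(n=200, seed=0, noise_sd=0.1):
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        row = design_row(rng.uniform(-1, 1), rng.uniform(-1, 1))
        rows.append(row)
        y.append(sum(b * v for b, v in zip(TRUE_COEFS, row)) + rng.gauss(0, noise_sd))
    return rows, y

if __name__ == "__main__":
    rows, y = simulate()
    beta, se = fit_ols(rows, y)
    for name, b, s in zip(TERMS, beta, se):
        print(f"{name:28s} estimate {b:6.3f}  SE {s:.3f}")
```

Each of the five estimates can be mapped onto one assumed relation and tested against it, which is the kind of correspondence between model and theory argued for above.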


Key points 

Artificial neural networks are powerful statistical tools for pattern recognition and prediction. 

Artificial neural networks transform input patterns into output patterns by non-linear multi-way 
interactions between simulated neurons that are governed by information that is stored in 
connection weights in an implicit and distributed way. 

This “black box” nature of artificial neural networks hampers the systematic testing of theories 
and the communication of results in practical settings. 

More conventional regression-type models can also handle non-linear relations, interaction 
effects, a high number of variables, correlated errors, missing values, and non-normal 
distributions. 

Artificial neural network analysis cannot replace conventional statistical methods in the learning 
sciences but may be applicable in specific cases. 